Python

Python is a widely used, general-purpose, high-level programming language. Its design philosophy emphasizes code readability, and it is very popular in science.

Jupyter

The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.

  • Evolved from IPython notebook
  • In addition to Python, it supports many other programming languages (Julia, R, Haskell, etc.)
  • http://jupyter.org/

Getting started

Anaconda/Conda (need to install)

Web hosted (only need a web browser)

The notebook

Cell types - markdown and code

This is a Markdown cell


In [1]:
print('This is a cell with code')


This is a cell with code

Variables, lists and dictionaries


In [2]:
var1 = 1
my_string = "This is a string"

In [3]:
var1


Out[3]:
1

In [4]:
print(my_string)


This is a string

In [5]:
my_list = [1, 2, 3, 'x', 'y']
my_list


Out[5]:
[1, 2, 3, 'x', 'y']

In [6]:
my_list[0]


Out[6]:
1

In [7]:
my_list[1:3]


Out[7]:
[2, 3]
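The indexing and slicing cells above generalize in a few useful ways. The following is a minimal sketch rather than part of the original notebook; the names `last_item`, `first_two`, `every_other` and `length` are illustrative.

```python
my_list = [1, 2, 3, 'x', 'y']

# Negative indices count from the end of the list
last_item = my_list[-1]

# An omitted start defaults to 0; the end index is excluded
first_two = my_list[:2]

# A third slice value sets the step (here: every second item)
every_other = my_list[::2]

# Lists are mutable: append() adds an item at the end
my_list.append('z')
length = len(my_list)

print(last_item)
print(first_two)
print(every_other)
print(length)
```

Note that a slice returns a new list, so `first_two` and `every_other` are unaffected by the later `append()`.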

In [8]:
salaries = {'Mike':2000, 'Ann':3000}

In [9]:
salaries['Mike']


Out[9]:
2000

In [10]:
salaries['Jake'] = 2500

In [11]:
salaries


Out[11]:
{'Ann': 3000, 'Jake': 2500, 'Mike': 2000}
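Dictionaries support a few more everyday operations beyond lookup and assignment. This sketch assumes the same `salaries` data as above; the key `'Zoe'` and the variable names are hypothetical and only serve the illustration.

```python
salaries = {'Mike': 2000, 'Ann': 3000, 'Jake': 2500}

# `in` tests whether a key exists
has_ann = 'Ann' in salaries

# get() returns a fallback value instead of raising KeyError
missing = salaries.get('Zoe', 0)

# sorted() on a dictionary returns its keys in sorted order
names = sorted(salaries)

# values() gives access to all stored values
total = sum(salaries.values())

for name in names:
    print(name + ': ' + str(salaries[name]))
```

Iterating over `sorted(salaries)` gives a predictable order regardless of how the dictionary itself happens to be displayed.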

Strings


In [12]:
long_string = 'This is a string \n Second line of the string'

In [13]:
print(long_string)


This is a string 
 Second line of the string

In [14]:
long_string.split(" ")


Out[14]:
['This', 'is', 'a', 'string', '\n', 'Second', 'line', 'of', 'the', 'string']

In [15]:
long_string.split("\n")


Out[15]:
['This is a string ', ' Second line of the string']

In [16]:
long_string.count('s') # case sensitive!


Out[16]:
4

In [17]:
long_string.upper()


Out[17]:
'THIS IS A STRING \n SECOND LINE OF THE STRING'
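The string methods above combine well with a few others. This is a small sketch, not part of the original notebook, and the variable names are illustrative.

```python
long_string = 'This is a string \n Second line of the string'

# split() without an argument splits on any run of whitespace,
# so the newline does not show up as a separate item
words = long_string.split()

# strip() removes leading and trailing whitespace from each line
lines = [line.strip() for line in long_string.split('\n')]

# join() is the counterpart of split(): it glues pieces back together
rejoined = ' '.join(words)

# Lowercasing first makes count() case-insensitive
s_count = long_string.lower().count('s')

print(words)
print(lines)
print(rejoined)
print(s_count)
```

Compared with `long_string.count('s')` above, the lowercased version also counts the capital `S` in "Second".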

Conditionals


In [18]:
if long_string.startswith('X'):
    print('Yes')
elif long_string.startswith('T'):
    print('It has T')
else:
    print('No')


It has T
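Conditions can also be combined with `and`, `or` and `not`, and the `in` operator tests for a substring. A minimal sketch, with `verdict` as an illustrative name:

```python
long_string = 'This is a string \n Second line of the string'

if 'string' in long_string and len(long_string) > 10:
    # Both conditions must hold for this branch to run
    verdict = 'long text mentioning a string'
elif not long_string:
    # An empty string counts as false
    verdict = 'empty'
else:
    verdict = 'something else'

print(verdict)
```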

Loops


In [19]:
for line in long_string.split('\n'):
    print(line)


This is a string 
 Second line of the string

In [20]:
c = 0
while c < 10:
    c += 2
    print(c)


2
4
6
8
10
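Two built-ins often make loops shorter: `enumerate()` yields a counter alongside each item, and `range()` generates a sequence of numbers. The sketch below reproduces the two loops above in that style; `numbered` and `evens` are illustrative names.

```python
long_string = 'This is a string \n Second line of the string'

# enumerate() pairs each line with its position, starting at 0
numbered = []
for number, line in enumerate(long_string.split('\n')):
    numbered.append(str(number) + ': ' + line.strip())

# range(start, stop, step) excludes the stop value,
# so range(2, 11, 2) yields 2, 4, 6, 8, 10
evens = []
for c in range(2, 11, 2):
    evens.append(c)

print(numbered)
print(evens)
```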

List comprehensions


In [21]:
some_numbers = [1,2,3,4]

In [22]:
[x**2 for x in some_numbers]


Out[22]:
[1, 4, 9, 16]
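A comprehension can also filter its input with a trailing `if`, and the same syntax with curly braces builds a dictionary. A short sketch, where `even_squares` and `square_of` are illustrative names:

```python
some_numbers = [1, 2, 3, 4]

# Keep only the even numbers, then square them
even_squares = [x**2 for x in some_numbers if x % 2 == 0]

# Dictionary comprehension: map each number to its square
square_of = {x: x**2 for x in some_numbers}

print(even_squares)
print(square_of[3])
```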

File operations


In [23]:
with open('../README.md', 'r') as f:
    content = f.read()

In [24]:
print(content)


# HUB-ipython
##HUB (Heidelberg Unseminars in Bioinformatics) about IPython and chemoinformatics
This repository contains material for [HUB 21](http://www.hub-hub.de/wordpress/?tribe_events=hub-21-interactive-notebooks-try-out-programming-jupyteripython-chemoinformatics)
  
### Location and date
BioMed X, Im Neuenheimer feld 583, Heidelbeg, 21 January 2016

### How to run code from this repo
I recommend [Anaconda](https://www.continuum.io/downloads) Python distribution with Python 2.7. In Anaconda terminal type `jupyter notebook` and navigate to folder with notebooks.  
You can install [RDKit](http://www.rdkit.org/) (chemoinformatics library) with `conda install -c https://conda.anaconda.org/rdkit rdkit`
  
If you can't install Python on your computer you can use [tmpnb.org](tmpnb.org) to experiment with Jupyter (rdkit is not available in tmpnb.org).
  
Pair programming presentation can be converted to html and served with:
```bash
jupyter nbconvert Pair\ programming.ipynb --to slides --post serve --ServePostProcessor.port=8910
```
#### Current plans are to include this:

* short intro about BioMed X
* an ice-breaker - maybe the 'classic' standing in a line/grid to describe our experience coding, which we can also use for teaming people up for later activities - a group of experienced coders goes to the part of the room where people haven't done any coding before, and we find experienced partners for each of them
* a demo from Samo of IPython in chemoinformatic context
* maybe someone else demos something similar using a different language (R? Perl?)
* pair-programming together, using examples from [Rosalind](http://rosalind.info/problems/locations/)

angle we'd use to publicise the event would be:

* never coded before? come and try out coding with experts, to see what it's like
* interested in seeing what iPython/jupyter can do? come along and check it out - especially if you're into chemoinformatics - note that similar stuff is available for other languages
* chance to try out pair programming

For this, we'll ideally have a bunch of people joining the event who will bring their own laptops, have appropriate software set up and running, so that we have enough laptops to pair-program with everyone - if not, maybe we try doing it in groups of 3. 

### TODO
- [ ] prepare intro notebook
- [X] install instructions
- [ ] list optional chemo/bioinfo and visualisation libs
- [X] check how to make jupyter work with other programming languages
- [ ] guest wifi vouchers 
- [ ] buy beer
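Writing a file works the same way as reading, with mode `'w'` instead of `'r'`. The sketch below is an illustration only: the file name `notes.txt` and its contents are hypothetical, and a temporary directory is used so that nothing in the repository is touched.

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

# Mode 'w' creates the file (or overwrites an existing one)
with open(path, 'w') as f:
    f.write('first line\n')
    f.write('second line\n')

# Iterating over a file object yields one line at a time
with open(path, 'r') as f:
    lines = [line.rstrip('\n') for line in f]

print(lines)
```

The `with` statement closes the file automatically, even if an error occurs inside the block.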

Functions


In [25]:
def average(numbers):
    # Convert before dividing so integer input is not truncated in Python 2
    return float(sum(numbers)) / len(numbers)

In [26]:
average([1,2,2,2.5,3,])


Out[26]:
2.1

In [27]:
list(map(average, [[1, 2, 2, 2.5, 3], [3, 2.3, 4.2, 2.5, 5]]))


Out[27]:
[2.1, 3.4]
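Functions can carry a docstring, default argument values and input checks. The following variant of `average` is a sketch under assumptions: the `ndigits` parameter and the error message are hypothetical additions, not part of the original notebook.

```python
def average(numbers, ndigits=2):
    """Return the arithmetic mean of numbers, rounded to ndigits places."""
    if not numbers:
        # Fail early with a clear message instead of dividing by zero
        raise ValueError('average() needs at least one number')
    return round(float(sum(numbers)) / len(numbers), ndigits)

# The default is used when ndigits is not given
result = average([1, 2, 2, 4])

# Keyword arguments override the default
rough = average([1, 2, 4], ndigits=1)

print(result)
print(rough)
```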

In [28]:
# %load cool_events.py
#!/usr/bin/env python
from IPython.display import HTML

class HUB:
    """
    HUB event class
    """
    def __init__(self, version):
        self.full_name = "Heidelberg Unseminars in Bioinformatics"
        self.info = HTML("<p>Heidelberg Unseminars in Bioinformatics are participant-"
            "driven meetings where people with an interest in bioinformatics " 
            "come together to discuss hot topics and exchange ideas and then go "
            "for a drink and a snack afterwards.</p>")
        self.version = version
    def __repr__(self):
        return self.full_name

In [29]:
this_event = HUB(21)

In [30]:
this_event


Out[30]:
Heidelberg Unseminars in Bioinformatics

In [31]:
this_event.full_name


Out[31]:
'Heidelberg Unseminars in Bioinformatics'

In [32]:
this_event.version


Out[32]:
21
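Besides attributes, a class can define methods that work with the instance data. The `Event` class below is a hypothetical, simplified stand-in for `HUB` (without the IPython dependency), and the `label` method is an illustrative addition.

```python
class Event(object):
    """Hypothetical, simplified stand-in for the HUB class above."""

    def __init__(self, full_name, version):
        self.full_name = full_name
        self.version = version

    def label(self):
        # Methods receive the instance as `self` and can read its attributes
        return self.full_name + ' #' + str(self.version)

    def __repr__(self):
        # __repr__ controls what the notebook shows for the bare object
        return self.full_name


this_event = Event('Heidelberg Unseminars in Bioinformatics', 21)
print(this_event.label())
```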

Python libraries

A library is a collection of reusable resources, such as pre-written code, subroutines and classes.


In [33]:
from math import exp

In [34]:
exp(2)  # press Shift+Tab to access the documentation


Out[34]:
7.38905609893065

In [35]:
import math

In [36]:
math.exp(10)


Out[36]:
22026.465794806718
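The import styles shown above can be mixed freely, and a module can also be given a shorter alias. A small sketch using only the standard `math` module; the alias `m` and the variable names are illustrative.

```python
import math
from math import sqrt, pi   # import several names at once
import math as m            # alias: `m` refers to the same module

# Names imported with `from` are used without the module prefix
hypotenuse = sqrt(3**2 + 4**2)

# Constants live in the module as well
circle_area = pi * 2**2

# The alias and the full name point to one and the same module object
same_module = m is math

print(hypotenuse)
print(circle_area)
print(same_module)
```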

In [37]:
import numpy as np # NumPy - package for scientific computing

In [38]:
#import pandas as pd # Pandas - package for working with data frames (tables)

In [39]:
#import Bio # BioPython - package for bioinformatics

In [40]:
#import sklearn # scikit-learn - package for machine learning

In [41]:
#from rdkit import Chem # RDKit - Chemoinformatics library

Plotting


In [42]:
%matplotlib inline

In [43]:
import matplotlib.pyplot as plt

In [44]:
x_values = np.arange(0, 20, 0.1)
y_values = [math.sin(x) for x in x_values]

In [45]:
plt.plot(x_values, y_values)


Out[45]:
[<matplotlib.lines.Line2D at 0x7fc5da94a550>]

In [46]:
plt.scatter(x_values, y_values)


Out[46]:
<matplotlib.collections.PathCollection at 0x7fc5da8a2890>

In [47]:
plt.boxplot(y_values)


Out[47]:
{'boxes': [<matplotlib.lines.Line2D at 0x7fc5da7eb5d0>],
 'caps': [<matplotlib.lines.Line2D at 0x7fc5da77a490>,
  <matplotlib.lines.Line2D at 0x7fc5da77aad0>],
 'fliers': [<matplotlib.lines.Line2D at 0x7fc5da784790>],
 'means': [],
 'medians': [<matplotlib.lines.Line2D at 0x7fc5da784150>],
 'whiskers': [<matplotlib.lines.Line2D at 0x7fc5ffcb3bd0>,
  <matplotlib.lines.Line2D at 0x7fc5da7ebe10>]}
